SEA Technical Memorandum #0401, ARC 6.02; General Archive Format
Last updated: April 27, 1989
Copyright 1989 by System Enhancement Associates, Inc.
ARC 6.02
General Archive Format
The ARC file archive format, created by System Enhancement Associates in
March of 1985, has previously been documented primarily by the ARC sources
themselves. The purpose of this document is to provide a separate overview
of the construction and format of an ARC format archive. It is not our
intent to document the actual compression algorithms themselves in this
document. Those remain defined by the ARC sources.
An ARC format archive consists of one or more archive entries where each
entry begins with an "entry header". This header is typically followed by
data that applies to the header. In the usual case of a header that
identifies a compressed file, the data is the compressed file.
There are two general categories of entry headers: compressed files and
control information. Every header without exception begins with an entry
header marker, which is a single byte with a value of 26 decimal (1A hex,
"control Z"). This marker byte is immediately followed by a one byte
"header type code" that identifies the format and type of the header which
follows.
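As a minimal sketch of the rule just described, the following Python reads the
marker byte and the header type code that open an entry. The function name
read_entry_prefix and the choice of ValueError for reporting problems are
illustrative assumptions and do not come from the ARC sources.

```python
import io

ARC_MARKER = 0x1A  # 26 decimal, "control Z"


def read_entry_prefix(stream):
    """Read the marker byte and header type code that open every entry.

    Returns the header type code as an integer. Raises ValueError when the
    two bytes cannot be read or the first byte is not the marker.
    """
    prefix = stream.read(2)
    if len(prefix) != 2:
        raise ValueError("truncated archive: missing marker or type code")
    marker, header_type = prefix[0], prefix[1]
    if marker != ARC_MARKER:
        raise ValueError("expected marker 0x1A, found 0x%02X" % marker)
    return header_type


# Usage: a marker byte followed by header type code 2.
print(read_entry_prefix(io.BytesIO(b"\x1a\x02")))  # prints 2
```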
The majority of all entry headers will be for "standard compressed files",
and will have the following format:
    Offset   Length   Description
    ------   ------   -----------
       0       13     Null-terminated filename
      14        4     Size of the compressed data, in bytes
      18        2     Creation date, in MS-DOS format
      20        2     Creation time, in MS-DOS format
      22        2     Cyclical redundancy check value (CRC)
      24        4     True length of uncompressed file
This is referred to in our documentation as a "standard header". In almost
all cases an entry header is made to follow the format of a standard header
as much as possible. At this time it is possible to treat any header that
is encountered as if it were a standard header, with two exceptions:
* A type one header is an obsolete form of a type two header
  (uncompressed file) that is four bytes shorter. A type one header
  may be converted to a type two header on input by (a) reading four
  bytes less than the full header size, and then (b) setting the size of
  the uncompressed file equal to the size of the compressed data.
* A type zero header marks the end of an archive, and has no header data.
  That is, an archive ends with an entry header marker byte followed by
  a zero byte.
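The header layout and the type one conversion might be decoded along the
following lines. This is a sketch under stated assumptions, not the ARC
implementation: it follows the offsets exactly as tabulated above, which leave
the byte at offset 13 undescribed (the sketch skips it), and it assumes
little-endian byte order because of the format's MS-DOS origin. The names
parse_header, decode_dos_date, and decode_dos_time are hypothetical helpers.
The layout should be confirmed against the ARC sources before relying on it.

```python
import struct
from collections import namedtuple

# Field layout taken from the offset table: a 13-byte name at offset 0,
# one byte at offset 13 that the table does not describe (skipped here),
# then size, date, time, CRC, and true length. Little-endian byte order
# is an assumption.
STANDARD_LAYOUT = struct.Struct("<13sxIHHHI")  # 28 bytes
SHORT_LAYOUT = struct.Struct("<13sxIHHH")      # 24 bytes (type one)

Header = namedtuple(
    "Header", "header_type name compressed_size date time crc true_length"
)


def parse_header(header_type, data):
    """Decode the header data of a compressed-file entry."""
    if header_type == 1:
        name, size, date, time, crc = SHORT_LAYOUT.unpack(data)
        length = size  # type one: uncompressed size equals compressed size
    else:
        name, size, date, time, crc, length = STANDARD_LAYOUT.unpack(data)
    # The filename is null-terminated; keep only the bytes before the null.
    name = name.split(b"\0", 1)[0].decode("ascii", errors="replace")
    return Header(header_type, name, size, date, time, crc, length)


def decode_dos_date(value):
    """Split an MS-DOS packed date into (year, month, day)."""
    return 1980 + (value >> 9), (value >> 5) & 0x0F, value & 0x1F


def decode_dos_time(value):
    """Split an MS-DOS packed time into (hour, minute, second)."""
    return value >> 11, (value >> 5) & 0x3F, (value & 0x1F) * 2


# Usage: build 28 bytes of header data in memory, then decode them.
raw = STANDARD_LAYOUT.pack(b"README.TXT", 120, 0x129B, 0x6000, 0xBEEF, 300)
header = parse_header(2, raw)
print(header.name, header.compressed_size, header.true_length)
print(decode_dos_date(header.date), decode_dos_time(header.time))
```

Passing the header type into the parser keeps the type one special case in a
single place, so the rest of a program can treat every entry as if it carried
a standard header.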
Thus, the process for scanning through an ARC format archive picking out
entry headers may be summed up as follows:
1) Read one byte for the entry header marker. If it is not an entry
   header marker, then exit with an error condition.
2) Read one byte for the header type code.
3) If the header type is zero, stop.
4) If the header type is one, read 24 bytes of header data. Then set
   the uncompressed size equal to the compressed size.
5) If the header type is anything else, read 28 bytes of header data.
6) Do whatever you had in mind with the header data.
7) Perform a "relative seek" forward, skipping a number of bytes equal
   to the compressed data size. Return to step (1).
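The seven steps above can be sketched in Python as follows. This is an
illustration rather than the ARC implementation: the function name
scan_archive and the tuple it records are hypothetical, the header sizes of 24
and 28 bytes are taken from the steps as written, and the little-endian field
layout (13-byte name, one undescribed byte, 4-byte compressed size at offset
14) is an assumption drawn from the offset table.

```python
import io
import struct

ARC_MARKER = 0x1A
# Assumed layout of the first two fields: 13-byte name, one undescribed
# byte, then the 4-byte little-endian compressed size at offset 14.
NAME_AND_SIZE = struct.Struct("<13sxI")


def scan_archive(stream):
    """Return a list of (header_type, name, compressed_size, data_offset)."""
    entries = []
    while True:
        prefix = stream.read(2)                        # steps 1 and 2
        if len(prefix) < 2 or prefix[0] != ARC_MARKER:
            raise ValueError("entry header marker not found")
        header_type = prefix[1]
        if header_type == 0:                           # step 3
            return entries
        header_size = 24 if header_type == 1 else 28   # steps 4 and 5
        header = stream.read(header_size)
        if len(header) != header_size:
            raise ValueError("truncated entry header")
        name, compressed_size = NAME_AND_SIZE.unpack_from(header)
        name = name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        # Step 6: here the "use" is simply recording where the data starts.
        entries.append((header_type, name, compressed_size, stream.tell()))
        stream.seek(compressed_size, io.SEEK_CUR)      # step 7


# Usage: a two-entry archive built in memory, ending with a type zero header.
def make_entry(name, payload):
    header = struct.pack("<13sxIHHHI", name, len(payload), 0, 0, 0, len(payload))
    return bytes([ARC_MARKER, 2]) + header + payload


demo = make_entry(b"A.TXT", b"hello") + make_entry(b"B.TXT", b"hi") + b"\x1a\x00"
for entry in scan_archive(io.BytesIO(demo)):
    print(entry)
```

Because the scan only needs the compressed size to find the next entry, it
never has to understand the compression method of the data it skips.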
Header types twenty and up identify extended information, and are described
in TM0402, "ARC 6.02; Extended Data". Standard compressed files are
identified as header types one through nineteen, as follows:
    Type   Compression method
    ----   ------------------
      1    No compression (short header)
      2    No compression (standard header)
      3    Repeated Character Compression
      4    RCC followed by Huffman
      5    12 bit Lempel-Ziv
      6    RCC followed by 12 bit Lempel-Ziv
      7    RCC followed by 12 bit Lempel-Ziv, alternate hash function
      8    RCC followed by variable 12 bit Lempel-Ziv with dynamic reset
      9    variable 13 bit Lempel-Ziv with dynamic reset (nonstandard)
     10+   Reserved for future use
These compression methods all have common names associated with them, as
follows:
    Type   Common name
    ----   -----------
      1    stored
      2    Stored
      3    Packed
      4    Squeezed
      5    crunched
      6    crunched
      7    crunched
      8    Crunched
      9    Deviant
     10+   Other
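A program that lists archive contents might fold the two tables into a single
lookup, as in the sketch below. The names ARC_METHODS and describe_header_type
are hypothetical, and the handling of types zero and twenty and up follows the
descriptions given earlier in this memorandum rather than any table entry. The
capitalization of the common names is reproduced exactly as tabulated.

```python
# Header type -> (compression method, common name), copied from the tables.
ARC_METHODS = {
    1: ("No compression (short header)", "stored"),
    2: ("No compression (standard header)", "Stored"),
    3: ("Repeated Character Compression", "Packed"),
    4: ("RCC followed by Huffman", "Squeezed"),
    5: ("12 bit Lempel-Ziv", "crunched"),
    6: ("RCC followed by 12 bit Lempel-Ziv", "crunched"),
    7: ("RCC followed by 12 bit Lempel-Ziv, alternate hash function",
        "crunched"),
    8: ("RCC followed by variable 12 bit Lempel-Ziv with dynamic reset",
        "Crunched"),
    9: ("variable 13 bit Lempel-Ziv with dynamic reset (nonstandard)",
        "Deviant"),
}


def describe_header_type(header_type):
    """Return (description, common name or None) for a header type code."""
    if header_type == 0:
        return ("End of archive", None)
    if header_type in ARC_METHODS:
        return ARC_METHODS[header_type]
    if 10 <= header_type <= 19:
        return ("Reserved for future use", "Other")
    if header_type >= 20:
        return ("Extended information (see TM0402)", None)
    raise ValueError("header type codes are not negative")


# Usage: look up a few header type codes.
for code in (2, 8, 12, 20):
    print(code, describe_header_type(code))
```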
Conclusion
==========
We hope that anyone seeking to work with ARC format archives finds this
information of use. If we've left out anything you require, please feel
free to contact us. We can be reached by voice between 9 AM and 5 PM
Eastern time at (201) 473-5153. You can also leave a message for us on our
customer support bulletin board at (201) 473-1991. This is a five-line
system that is available 24 hours a day at up to 2400 baud. We can also be
reached by mail at:
System Enhancement Associates, Inc.
21 New Street, Wayne NJ 07470